Abstract

Dietary proteins are known to contain bioactive peptides that are released during digestion. Endogenous proteins secreted 
into the gastrointestinal tract represent a quantitatively greater supply of protein to the gut lumen than those of dietary 
origin. Many of these endogenous proteins are digested in the gastrointestinal tract but the possibility that these are also a 
source of bioactive peptides has not been considered. An in silico prediction method was used to test if bioactive peptides 
could be derived from the gastrointestinal digestion of gut endogenous proteins. Twenty six gut endogenous proteins and 
seven dietary proteins were evaluated. The peptides present after gastric and intestinal digestion were predicted based on 
the amino acid sequence of the proteins and the known specificities of the major gastrointestinal proteases. The predicted 
resultant peptides possessing amino acid sequences identical to those of known bioactive peptides were identified. After 
gastrointestinal digestion (based on the in silico simulation), the total number of bioactive peptides predicted to be released 
ranged from 1 (gliadin) to 55 (myosin) for the selected dietary proteins and from 1 (secretin) to 39 (mucin-5AC) for the 
selected gut endogenous proteins. Within the intact proteins and after simulated gastrointestinal digestion, angiotensin 
converting enzyme (ACE)-inhibitory peptide sequences were the most frequently observed in both the dietary and 
endogenous proteins. Among the dietary proteins, after in silico simulated gastrointestinal digestion, myosin was found to 
have the highest number of ACE-inhibitory peptide sequences (49 peptides), while for the gut endogenous proteins, mucin-5AC 
had the greatest number of ACE-inhibitory peptide sequences (38 peptides). Gut endogenous proteins may be an 
important source of bioactive peptides in the gut particularly since gut endogenous proteins represent a quantitatively 
large and consistent source of protein. 
Introduction

The main role of dietary proteins is to provide amino acids for 
body protein synthesis [1]. However, investigations over the last 
two decades have shown that dietary protein can also be a source 
of latent bioactive peptides (from 2 to greater than 40 amino acids 
long) that when released during digestion in the gastrointestinal 
tract can act as modulators of various physiological functions 
[2,3,4]. These peptides are reported to possess a range of effects 
including antihypertensive, cholesterol-lowering, antioxidant, 
anticancer, immunomodulatory, antimicrobial, opioid, antiobesity 
and mineral binding effects [2,5,6]. The most extensively studied 
dietary sources of these bioactive peptides include milk, egg, meat, 
soya and cereal proteins [2,3,7,8]. The bioactive peptides released 
during the digestion of dietary proteins are believed to act either 
within the gastrointestinal tract or are possibly absorbed into the 
bloodstream where they may act systemically [3,9,10,11]. 

The supply of dietary proteins, and therefore the supply of 
gastrointestinal bioactive peptides derived from those proteins, will 
likely be highly variable as humans do not consume the same foods 
or amounts of food on a day-to-day basis. However, a considerable 
amount of endogenous (non-dietary) protein is also present in the 
lumen of the gastrointestinal tract during digestion, consisting of 
proteins such as mucins, serum albumin, digestive enzymes, 
protein within sloughed epithelial cells and microbial protein, and 
this material may be a source of bioactive peptides [12]. When 
compared to dietary protein, gut endogenous proteins represent a 
larger and more constant supply of protein in the gastrointestinal 
tract [13,14,15,16], with endogenous nitrogen entering the 
digestive tract of humans being quantitatively equal or greater 
than the dietary nitrogen intake [16,17,18,19,20]. In a study 
conducted using pigs fed a casein-based diet, it has been reported 
that up to 80% of endogenous proteins are digested and 
reabsorbed by the end of the small intestine [14]. During digestion 
a wide range of endogenous protein-derived peptides are likely to 
be generated, but the biological activity of such endogenously 
sourced gut peptides has not yet been considered. Potentially, gut 
endogenous proteins could be an important source of gut bioactive 
peptides given the amount of endogenous proteins present in the 
gastrointestinal tract. This study aimed to use an in silico approach 
to investigate whether known bioactive peptide sequences are 
present within the amino acid sequences of endogenous proteins 
secreted along the gastrointestinal tract and whether these 
bioactive peptides may potentially be released during enzymatic 
digestion in the human gastrointestinal tract. To our knowledge, 
the present study is the first to show that the amino acid sequences 
of gut endogenous proteins hold within them abundant bioactive 
peptide sequences and that the possibility exists that these peptides 
are released during gastrointestinal digestion. 

Methods 

Twenty six known human gut endogenous proteins with well 
characterised amino acid sequences were examined. Additionally, 
7 dietary proteins, which have been reported to contain bioactive 
peptides, were also examined [2,21,22,23,24]. The proteins 
analysed are shown in Table 1. 

Prediction of the Total Number of Bioactive Peptide 
Sequences Present in the Intact Gut Endogenous and 
Dietary Proteins 

To predict the number of bioactive peptide sequences encoded 
within the gut endogenous and dietary proteins, the amino acid 
sequence of each protein was obtained from an online protein 
database [25]. The amino acid sequence of each protein was 
examined for the presence of known bioactive peptide sequences 
using an online bioactive peptide database [22]. The latter 
database contained the amino acid sequences of 2609 peptides, 
spanning 48 different bioactivities, that are known to be 
bioactive based on either in vitro or in vivo studies [22,26]. The 
bioactive peptide database was used according to the instructions 
given and peptides possessing one or more of the following 
bioactivities were identified [antiamnestic, angiotensin converting 
enzyme (ACE)-inhibitor, antithrombotic, stimulating (glucose 
uptake-, -vasoactive substance release), regulating (ion flow-, 
stomach mucosal membrane activity-, phosphoinositol mechanism 
peptide-), antioxidative, bacterial permease ligand, inhibitor 
(dipeptidyl peptidase IV-, dipeptidyl-aminopeptidase IV-, 
dipeptidyl carboxypeptidase-, cyclic nucleotide phosphodiesterase 1 
(CaMPDE)-, neuropeptide-), hypotensive, activating ubiquitin 
mediated proteolysis]. The total number of bioactive peptide 
sequences identified in the intact proteins was recorded for each 
gut endogenous and dietary protein. 
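
The database search described above amounts to scanning a protein sequence for every occurrence of each known bioactive peptide sequence. A minimal sketch of that idea follows. The function name, the toy sequence and the four-entry toy database are hypothetical, and counting overlapping occurrences is an assumption made here for illustration; the counting rules actually used by the online database are not described in the text.

```python
from typing import Dict, Iterable


def count_known_peptides(protein: str, known_peptides: Iterable[str]) -> Dict[str, int]:
    """Count every occurrence (overlaps included) of each known peptide in a protein."""
    counts: Dict[str, int] = {}
    for peptide in known_peptides:
        if not peptide:
            continue  # ignore empty entries
        hits = 0
        start = protein.find(peptide)
        while start != -1:
            hits += 1
            # Restart one residue further on so overlapping matches are counted.
            start = protein.find(peptide, start + 1)
        if hits:
            counts[peptide] = hits
    return counts


# Hypothetical toy data in one-letter amino acid code (not taken from the study).
toy_protein = "MKAALLVPPAA"
toy_database = {"AA", "LL", "VPP", "QK"}

counts = count_known_peptides(toy_protein, toy_database)
total = sum(counts.values())
print(counts, total)
```

In this toy case the dipeptide AA occurs twice, LL and VPP once each, and QK not at all, so the total count for the intact sequence is 4.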

The bioactive peptide frequency (A) and the relative bioactive 
peptide frequency (Y) are often used to describe the potency of 
proteins as sources of bioactive peptides [22,27]. In the present 
study the frequency of bioactive peptide sequences within the 
intact protein (A_0) was calculated as follows: 

$$A_0 = \frac{a_0}{N} \times 1000$$

where a_0 is the total number of identified bioactive peptides 
present in the protein, or the number of bioactive peptides with a 
specific activity based on the BIOPEP database [22], and N is the total 
number of amino acid residues within the protein. 

The relative frequency of occurrence of bioactive peptides with 
a specific activity (Y_j) [%] was calculated as follows: 

$$Y_j = \frac{a_{0j}}{a_0} \times 100$$

where a_{0j} is the number of peptides with a specific activity, a_0 is the 
total number of peptide sequences across all activity categories 
present within the protein, and j is the specified activity. 
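
The two measures defined above can be computed directly from peptide counts. The sketch below is illustrative only: the function names and the example counts are hypothetical, and the denominator of the relative frequency is taken to be the total across all activity categories, as the definition states.

```python
from typing import Dict


def frequency_per_1000(n_peptides: int, n_residues: int) -> float:
    """Bioactive peptide sequences per 1000 amino acid residues."""
    return 1000 * n_peptides / n_residues


def relative_frequency(counts_by_activity: Dict[str, int]) -> Dict[str, float]:
    """Percentage of all identified sequences that fall in each activity category."""
    total = sum(counts_by_activity.values())
    return {activity: 100 * n / total for activity, n in counts_by_activity.items()}


# Hypothetical counts for a 400-residue protein (not data from the study).
example_counts = {"ACE-inhibitor": 60, "inhibitor": 30, "antioxidative": 10}

a_0 = frequency_per_1000(sum(example_counts.values()), 400)
y = relative_frequency(example_counts)
print(a_0, y)
```

With these made-up counts, 100 sequences in 400 residues give a frequency of 250 per 1000 residues, and the ACE-inhibitor category accounts for 60% of the sequences.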



Prediction of the Frequency of Potential Bioactive 
Peptide Sequences Present in the Gastrointestinal Tract 
After Simulated Digestion in the Stomach, Stomach and 
Small Intestine and Small Intestine Alone (A_D) 

$$A_D = \frac{a_D}{N} \times 1000$$

where a_D is the number of identified bioactive peptides present 
after the simulated (in silico) digestion and N is the total number of 
amino acid residues within the protein. 

A prediction of the number of bioactive peptides that would be 
released from gut endogenous proteins and dietary proteins after 
upper gastrointestinal tract digestion was made using an in silico 
simulation based on the amino acid sequence of the proteins and 
the reported specificity of the major proteases present in the 
gastrointestinal tract. The site of secretion of the gut endogenous 
proteins was also taken into account. For the gut endogenous 
proteins secreted in the mouth and stomach and for the dietary 
proteins, gastric digestion was simulated in silico based on the 
amino acid sequence of the dietary or gut endogenous protein and 
the specificity of pepsin as documented by Keil [28,29]. Gastric 
and small intestinal digestion was predicted based on the specificity 
of pepsin, trypsin and chymotrypsin as documented by Keil 
[28,29]. For endogenous proteins secreted in the small intestine, 
only small intestinal digestion was simulated taking into account 
the reported specificity of trypsin and chymotrypsin only. The 
amino acid sequences of the endogenous and dietary proteins were 
obtained from a protein sequence database as described above 
[25]. The in silico simulated digestion was conducted using an 
online Peptide Cutter tool application [28]. The amino acid 
sequence of each of the predicted resultant peptides for each of the 
gut endogenous and dietary proteins was then compared to the 
amino acid sequence of known bioactive peptides using an online 
bioactive peptide sequence database [22]. 
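
The general logic of the simulation (cleave the sequence according to protease specificity, then keep only the fragments that exactly match a known bioactive peptide) can be sketched as follows. This is not the rule set of the online tool used in the study. The cleavage rules below are a commonly quoted simplification, assumed here for illustration: trypsin cuts after K or R and chymotrypsin after F, Y or W, in both cases not when the next residue is P. Pepsin is omitted because its specificity is harder to reduce to a single rule. The toy sequence and the toy set of known peptides are hypothetical.

```python
from typing import Iterable, List


def cleave(sequence: str, cut_after: str, blocked_before: str = "P") -> List[str]:
    """Cut on the C-terminal side of residues in cut_after unless the next residue blocks it."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence[:-1]):
        if residue in cut_after and sequence[i + 1] not in blocked_before:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


def digest(sequence: str, enzymes: Iterable[str]) -> List[str]:
    """Apply each enzyme's simplified rule in turn to all current fragments."""
    fragments = [sequence]
    for cut_after in enzymes:
        fragments = [piece for frag in fragments for piece in cleave(frag, cut_after)]
    return fragments


# Simplified specificities (assumptions, see the paragraph above).
TRYPSIN = "KR"
CHYMOTRYPSIN = "FYW"

toy_protein = "GLFAKVPKEYR"          # hypothetical sequence
known_peptides = {"LF", "EY", "VPK"}  # toy stand-in for the peptide database

fragments = digest(toy_protein, [TRYPSIN, CHYMOTRYPSIN])
released = [f for f in fragments if f in known_peptides]
print(fragments, released)
```

In this toy case LF is encoded in the intact sequence but is not released as a free peptide, because the simplified rules leave it inside the fragment GLF. The same effect would explain why far fewer peptides are predicted after digestion than are found in the intact sequences.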

Results 

The Total Number and Frequency (A_0) of Bioactive 
Peptide Sequences within the Amino Acid Sequence of 
Intact Gut Endogenous Proteins and Intact Dietary 
Proteins 

Among the dietary proteins studied, the amino acid chain 
length of the proteins varied from 209 (β-casein) to 1939 (myosin) 
amino acids, while for the gut endogenous proteins the range was 
from 80 (human gastrin) to 5159 (human mucin-2) amino acids 
(Table 1). The total number of bioactive peptide sequences 
identified and their corresponding potential bioactivities, within 
the amino acid sequences of the intact gut endogenous and dietary 
proteins are shown in Table 2. In addition, the A_0 values for each 
activity and for all the activities considered along with Y values for 
each of the proteins are also shown. 

The total number of bioactive peptides, present within the 
amino acid sequences of the intact protein molecules for the gut 
endogenous proteins, ranged from 46 peptides for somatostatin to 
2507 peptides for mucin-5AC (Table 2). When based on the 
subclasses of proteins presented in Table 1, the total number of 
identified bioactive peptide sequences present within the amino 
acid sequences of the intact protein molecules ranged from 
142-2507 for the mucins, 339 for serum albumin, 125-268 for the 
digestive enzymes, 46-86 for the hormones and 68-223 for the 
remaining "other" proteins. For the dietary proteins, the total 
number of identified bioactive peptide sequences present within 
the amino acid sequence of the intact proteins ranged from 148 for 
glutenin to 1072 for myosin. 

Table 1. Gut endogenous and dietary proteins examined in the in silico study¹.

Protein/peptide classified based on the function in the body, with accession number¹ | Site of secretion within gastrointestinal tract | Chain length of mature protein² (no. of amino acid residues)

Lubrication, maintenance of integrity of tissue lining, cell signaling, immunity
Mucin-2 (Q02817) | Small intestine and colon | 5159
Mucin-3A (Q02505) | Small intestine | 2520
Mucin-3B (Q9H195) | Small intestine and colon | 901
Mucin-5AC (P98088) | Stomach, oesophagus and proximal duodenum | 5003
Mucin-6 (Q6W4X9) | Stomach | 2417
Mucin-7 (Q8TAX7) | Salivary gland - mouth | 355
Mucin-13 (Q9H3R2) | Stomach, small intestine and colon | 493
Mucin-15 (Q8N387) | Small intestine and colon | 311
Mucin-20 (Q8N307) | Throughout the gut | 684

Maintenance of colloid osmotic pressure and acid-base balance and transport
Serum albumin (P02768) | From plasma into stomach and intestine | 591

Enzymes in digestion
Chymotrypsinogen B (P17538) | Pancreas | 245
Chymotrypsinogen B2 (Q6GPI1) | Pancreas | 245
Gastric triacylglycerol lipase (P07098) | Stomach | 379
Pancreatic amylase (P04746) | Pancreas | 496
Pancreatic triacylglycerol lipase (P16233) | Pancreas | 449
Pepsin (P00790) | Stomach | 373
Salivary amylase (P04745) | From salivary gland into mouth | 496
Trypsin (P07477) | Pancreas | 232

Hormones
Cholecystokinin (P06307) | Small intestine | 95
Gastrin (P01350) | Stomach, duodenum, pancreas | 80
Promotilin (P12872) | Small intestine (also affects gastric activity) | 90
Secretin (P09683) | Duodenum (also affects gastric pH) | 103
Somatostatin (P61278) | Stomach, intestine, pancreas | 92

Other proteins/peptides involved in the regulation of specific processes in the digestive tract/immunity
Gastric inhibitory peptide (P09681) | Stomach | 132
Gastric intrinsic factor (P27352) | Stomach | 399
Lysozyme C (P61626) | Throughout the gut | 130

Dietary proteins
β-casein, bovine milk (P02666) | - | 209
Gliadin, wheat (P02863) | - | 266
Glutenin, wheat (P10385) | - | 337
Glycinin, soya (P04347) | - | 492
Ovalbumin, chicken egg (P01012) | - | 386
Actin³, chicken meat (P60706) | - | 375
Myosin³, chicken meat (P13538) | - | 1939

¹Compiled from the UniProtKB Protein Database [25]. 
²The given chain length excludes signal peptide. 
³Initiator methionine not removed from the intact protein sequence (chain length inclusive of the initiator methionine). 
doi:10.1371/journal.pone.0098922.t001 

Among the observed bioactivity categories, angiotensin 
converting enzyme (ACE)-inhibitory peptide sequences were present 
in the largest numbers for all of the examined dietary and gut 
endogenous proteins, with Y ranging from 43% for gastrin to 75% 
for mucin-2 for the gut endogenous proteins and from 44% for 
gliadin to 67% for actin for the dietary proteins. For the gut 
endogenous proteins, the A_0 for the ACE-inhibitory peptide 
sequences ranged from 212 for mucin-3A to 485 for secretin, while 
for the dietary proteins the A_0 for the ACE-inhibitory peptide 
sequences ranged from 243 for glutenin to 608 for β-casein. 

In addition to the 10 most abundantly observed bioactive 
peptide categories presented in Table 2, peptide sequences 
reportedly possessing other bioactivities were also observed in a 
few select proteins. For example, opioid peptide sequences were 
present within the amino acid sequences of all of the dietary 
proteins but only a few of the endogenous proteins. Similarly, 
coeliac toxic peptide sequences were present within the amino acid 
sequences of the wheat proteins gliadin and glutenin only (data not 
shown). 

Predicted Number and Frequency (A_D) of Bioactive 
Peptides Released After Gastric Digestion of Dietary 
Proteins and Gut Endogenous Proteins Based on an in 
silico Simulation 

The number of bioactive peptides (and their corresponding 
predicted bioactivities) predicted to be released after gastric 
digestion of gut endogenous proteins secreted in the mouth and 
stomach and of dietary proteins based on an in silico simulation of 
gastric digestion are presented in Table 3. The total number of 
bioactive peptides predicted to be released after gastric digestion of 
the gut endogenous proteins ranged from 0 to 12 bioactive 
peptides per protein molecule for lysozyme C and serum albumin 
respectively. When grouped into the protein subclasses shown in 
Table 1, the total number of predicted bioactive peptides after 
digestion was 1-11 peptides per molecule for the mucins, 12 for 
serum albumin, 2-8 for the digestive enzymes, 0-2 for the 
hormones and 0-4 for the "other" proteins. For the dietary 
proteins, between 1 (glutenin and gliadin) and 11 (myosin) 
bioactive peptides were predicted to be released per protein 
molecule after gastric digestion. When the number of predicted 
bioactive peptides was presented in relation to the number of 
amino acids in each protein, the A_D value for the mucins, serum 
albumin, digestive enzymes, hormones and "other" proteins was 
1-6, 20, 4-21, 0-22 and 0-10 respectively. For the dietary 
proteins, the A_D value ranged from 3 for glutenin and actin to 14 
for β-casein. 
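
As a quick consistency check, the normalised value for serum albumin can be recomputed from figures given in the text: 12 peptides predicted after gastric digestion and a chain length of 591 residues (Table 1). The helper name below is hypothetical, and rounding to the nearest integer is an assumption about how the reported values were presented.

```python
def peptides_per_1000_residues(n_released: int, n_residues: int) -> float:
    """Predicted bioactive peptides after digestion, per 1000 amino acid residues."""
    return 1000 * n_released / n_residues


# Serum albumin: 12 predicted peptides after gastric digestion, 591 residues.
serum_albumin_gastric = peptides_per_1000_residues(12, 591)
print(round(serum_albumin_gastric))
```

The result is about 20.3, which agrees with the value of 20 reported for serum albumin.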

Bioactive peptides with ACE-inhibitory activity were predicted 
to be present after gastric digestion in higher numbers compared 
to peptides in the other activity categories, with a total of 51 
ACE-inhibitory peptides predicted to be present post-digestion across all 
of the examined proteins as compared to 0-22 predicted peptides 
for all of the other activity categories. Serum albumin and myosin 
were predicted to yield the largest number of ACE-inhibitory 
peptides after peptic digestion with 8 ACE-inhibitory peptides per 
protein molecule. This was closely followed by 7 ACE-inhibitory 
peptides for mucin-5AC. Considerably fewer ACE-inhibitory 
peptides were predicted (0^ peptides per molecule) for the 
remaining gut endogenous and dietary proteins. The other 
predicted bioactivities with identified peptides were stimulating 
(glucose uptake-), inhibitor (dipeptidyl peptidase IV-, 
dipeptidyl-aminopeptidase IV-), and antioxidative activities and activation of 
ubiquitin mediated proteolysis. 

Predicted Number and Frequency (A_D) of Bioactive 
Peptides Released After Gastric and Small Intestinal 
Digestion of Dietary Proteins and Gut Endogenous 
Proteins Based on an in silico Simulation 

The total number of bioactive peptides predicted to be released 
after gastric and small intestinal digestion in silico for the gut 
endogenous proteins secreted into the mouth and stomach, and 
that therefore underwent digestion in the stomach and small 
intestine, ranged from 1 peptide per protein molecule for secretin 
to 39 peptides per protein molecule for mucin-5AC (Table 4). 
When the proteins were divided into subclasses based on their 
functions as shown in Table 1, the predicted bioactive peptides 
released per protein molecule were 2-39 for the mucins, 22 for 
serum albumin, 4-15 for the digestive enzymes, 1-5 for the 
hormones and 3-10 for the "other" proteins. For the dietary 
proteins, the predicted number of bioactive peptides released after 
digestion (in silico) ranged from 1 for gliadin to 55 for myosin. 
When the size of the proteins was taken into account, the 
predicted A_D value for the mucins, serum albumin, digestive 
enzymes, hormones and "other" proteins was 3-17, 37, 11-40, 
10-56 and 23-31 respectively. For the dietary proteins, the 
predicted A_D value ranged from 4 for gliadin to 38 for β-casein. 

After in silico simulated gastric and small intestinal digestion, the 
most abundant bioactive peptides predicted to be present were the 
ACE-inhibitory peptides ranging from 1 peptide per protein 
molecule for secretin, somatostatin, gastric inhibitory peptide and 
gliadin to 38 for mucin-5AC. Other gut endogenous proteins from 
which notable amounts of ACE-inhibitory peptides were predicted 
to be released were mucin-6 (17 peptides per molecule), serum 
albumin (17 peptides per molecule) and gastric triacylglycerol 
lipase (10 peptides per molecule). Among the food proteins 
evaluated, myosin was predicted to yield the greatest number of 
ACE-inhibitory peptides (49 peptides per molecule). Other 
bioactive peptides predicted to be present after gastric and small 
intestinal digestion (based on an in silico simulation) across all 
proteins were glucose uptake-or vasoactive substance release- 
stimulating (0^ per molecule), dipeptidyl peptidase IV- or 
dipeptidyl-aminopeptidase IV-inhibitor (0-5 peptides per molecule), 
antioxidative (0-5 peptides per molecule), ion flow- or 
stomach mucosal membrane activity- regulating (0-1 peptides per 
molecule), and hypotensive (0-1 peptides per molecule) peptides 
and peptides activating ubiquitin mediated proteolysis (0-1 
peptides per molecule). 

Predicted Number and Frequency (A_D) of Bioactive 
Peptide Sequences Released After Small Intestinal 
Digestion of Gut Endogenous Proteins Secreted in the 
Small Intestine Based on an in silico Simulation 

For endogenous gut proteins that are secreted into the small 
intestine (for example, the pancreatic enzymes and small intestinal 
mucins) and therefore would not be subject to digestion in the 
stomach, an in silico analysis of the bioactive peptides that would be 
predicted to be released after intestinal digestion alone was 
performed (Table 5). For these proteins mucin-2, serum albumin 
and pancreatic amylase had the greatest predicted numbers of 
bioactive peptides released, with 24, 14 and 14 bioactive peptides 
respectively per molecule; while secretin had the least (1 peptide 
per molecule). Within the subclasses of proteins based on protein 
function and presented in Table 1, the predicted number of 
bioactive peptides released after digestion was 2-24 peptides per 
molecule for the mucins, 14 peptides per molecule for serum 
albumin, 4-14 peptides per molecule for the digestive enzymes, 
1-2 peptides per molecule for the hormones and 2 peptides 
per molecule for lysozyme C. The corresponding A_D values were 
5-10 for the mucins, 24 for serum albumin, 13-28 for the digestive 
enzymes, 10-25 for the hormones and 15 for lysozyme C. 

PLOS ONE | www.plosone.org | June 2014 | Volume 9 | Issue 6 | e98922 

Gut Endogenous Proteins-Derived Bioactive Peptides 

Discussion 

All of the protein amino acid sequences were sourced from the 
UniProt Protein Knowledgebase, a standard repository of protein 
sequence-related information [25]. BIOPEP, the database of 
bioactive peptides used in the present study, is a widely recognised 
and utilised tool for the bioinformatics based prediction of 
bioactive peptides in a given amino acid sequence [22,27,30,31]. 
The associated bioactivity of the peptide sequences listed in the 
BIOPEP database is documented and continually updated based 
on previous and on-going in vitro and in vivo studies [22,26,32,33]. 
The resultant peptides generated after simulated gastrointestinal 
digestion were predicted using Peptide Cutter, an enzymatic 
cleavage prediction tool [28] that is hosted by the ExPASy 
server, a standard resource used in bioinformatics and mass 
spectrometry-based studies [34]. 

The findings of the present study are based on an in silico 
gastrointestinal digestion prediction-model. The model is based on 
the amino acid sequence (primary structure) of the intact proteins 
and knowledge about the specificity of proteases in the gastrointestinal 
tract. Being an in silico model, it cannot be concluded with 
certainty that the purported bioactive peptides will be generated 
after the actual in vivo gastrointestinal tract digestion of gut 
endogenous proteins. However, there are similarities between data 
generated in the presently reported study and data generated in 
other in silico, in vitro and in vivo studies. For example, in the present 
study β-casein was found to be the greatest potential source of 
bioactive peptides, including ACE-inhibitory peptides. This 
finding is consistent with another in silico study that examined a 
range of food proteins and predicted that bovine caseins were the 
greatest source of ACE-inhibitory peptides [35]. In addition, 
Boutrou et al. [11] investigated the kinetics of the release of 
peptides from either casein or whey proteins in the jejunum of 
humans, and reported that β-casein released both larger numbers 
of bioactive peptide fragments and generated peptides with a 
diverse range of bioactivities. Moreover, and in line with our own 
findings, in vitro studies have shown that the antihypertensive 
peptides VPP and IPP present in the amino acid sequence of 
bovine β-casein, which are known to be released during 
lactobacilli-based fermentation of milk [36], are not released 
during enzymatic digestion using an in vitro digestion model that 
simulated digestion in the gastrointestinal tract [37]. 

Overall, the in silico technique used in the presently reported 
study does demonstrate that large numbers of bioactive peptide 
sequences do exist within the amino acid sequences of endogenous 
proteins that may be cleavable by the digestive enzymes and it is 
likely that in the process of digestion within the gut, bioactive 
peptides would be liberated from the gut endogenous proteins, 
particularly given that it is known from in vivo studies that as much 
as 80% of the endogenous protein secreted into the gastrointestinal 
tract is digested and reabsorbed [14,15]. The presently reported 
study does not include analysis of two major contributors to the 
non-dietary nitrogenous losses in the gut, namely, bacterial 
proteins and the sloughed epithelial cells. Also, factors that may 
influence in vivo protein digestion in the gastrointestinal tract, such 
as the tertiary structure of the proteins, the effects of food 
processing on protein digestion, and the influence of bacterial 
enzymatic digestion, have not been taken into account. An attempt 
has been made, however, to analyse a range of gut endogenous 
proteins secreted at different sites within the gut and with known 
amino acid sequences. 

All of the dietary and gut endogenous proteins evaluated in the 
present study contained large numbers of peptide sequences within 
the greater amino acid sequence of the intact protein that 
corresponded to the sequences of known bioactive peptides, at 
least based on the BIOPEP bioactive peptide database [22]. 
Furthermore, the total number of bioactive peptide sequences 
present in the overall amino acid sequence of the intact proteins 
varied across both dietary and gut endogenous proteins, although 
the range was much greater for the endogenous proteins. The 
mucin proteins generally contained the greatest number of 
bioactive peptide sequences while the hormone molecules 
contained the least. In comparison with the dietary proteins, 16 
of the 26 gut endogenous proteins contained a similar or greater 
number of bioactive peptide sequences per molecule. This suggests 
that based on amino acid sequence, the gut endogenous proteins 
may contain quantitatively significant amounts of bioactive 
peptides. In general, for both the food and gut endogenous 
proteins, smaller proteins contained comparatively fewer bioactive 
peptide sequences when compared to the larger proteins. The 
latter observation indicated, not unexpectedly, that the longer the 
amino acid chain of a protein, the higher the probability of finding 
peptide sequences that correspond to previously studied and 
reported bioactive peptides documented in the BIOPEP database 
[22]. 

If the gut endogenous proteins and food proteins are considered 
in terms of the potential bioactive profile (the relative number of 
bioactive peptide sequences within each bioactivity category), both 
gut endogenous and dietary proteins were similar with ACE- 
inhibitory peptide sequences being present in the greatest 
numbers. This may be attributed to the fact that ACE-inhibitory 
peptides have been researched more extensively in comparison to 
all of the other bioactivities and hence the bioactive peptide 
database used in the present study contains a much higher 
proportion of known ACE-inhibitory peptides as compared to 
bioactive peptides with other activities [2,38]. Both the gut 
endogenous proteins and the dietary proteins seem to contain 
remarkably similar relative numbers of bioactive peptides within 
each activity category particularly given the very different amino 
acid sequences across the different proteins. For example, ACE- 
inhibitory peptides comprised 43-75% of the total number of 
bioactive peptides found across the proteins while inhibitor 
peptides comprised 10-29%, antioxidative peptides comprised 
3-14%, stimulating peptides comprised 3-13% and hypotensive 
peptides comprised 0-2%. Overall, large numbers of bioactive 
peptide sequences were observed in the intact gut endogenous 
protein amino acid sequences. In comparison to the dietary 
proteins examined in the present study, gut endogenous proteins 
were similar in terms of being a potential source of bioactive 
peptides. 

Significant numbers of bioactive peptides were predicted to be 
released after gastric digestion (based on an in silico digestion 
model) of both food and gut endogenous proteins; however the 
numbers predicted were only 0-3.5% (average across all examined 
proteins = 1.0%) of the total number of bioactive peptide amino 
acid sequences identified in the intact amino acid sequences of 
each protein. It would appear, based on the in silico prediction used 
in the present study, that for both the dietary and gut endogenous 
proteins most of the predicted bioactive peptide sequences present 
in the intact proteins would not be released during gastric enzymic 
digestion. In terms of the bioactive peptides that were predicted to 
be released after gastric digestion, the gut endogenous proteins 
appeared to be similar to the dietary proteins both in terms of the 
total number of predicted bioactive peptides and the number of 
predicted bioactive peptides normalised for the amino acid chain 
length of the protein (A_D values). 

Table 6. Amino acid¹ sequences of bioactive peptides predicted to be released after mouth to ileum digestion of selected proteins based on an in silico digestion model.

Protein | Bioactivity² | Gastric | Gastric + small intestinal | Small intestinal

Gut endogenous
Serum albumin | 2 | KA, IA, LF, QK, GM, RL, VE, AA | IA, LF, PR, LY, QK, AW, EK, GM, EY, AR, EK, KP, VR, VPK, VE, VK, AA | LF, LY, AW, NY, EY, AR, VF, VPK, TF
Serum albumin | 4 | LV, LL | LV | -
Serum albumin | 6 | LHT | LHT, KP | LK, LY, TY
Serum albumin | 8 | KA, LL | - | EE
Serum albumin | 9 | - | - | EE
Somatostatin | 2 | AA | AA, QK | NF, TF
Somatostatin | 6 | EL | EL | -

Dietary
β-casein, bovine milk | 2 | HL, PLP | AR, VK, HK, EMPFPK, HL, PLP, GPFPIIV | -
β-casein, bovine milk | 4 | LI, LL | LI, LL | -
β-casein, bovine milk | 6 | HL | HL | -
β-casein, bovine milk | 8 | VA, LL | VA, LL | -
Ovalbumin, chicken egg | 2 | KE, YAEERYPIL, KG, EK | LY, LW, EK, VY, PR, GR | -
Ovalbumin, chicken egg | 4 | LL | LL | -
Ovalbumin, chicken egg | 6 | - | LK, LY, VY | -
Ovalbumin, chicken egg | 8 | LL | LL | -

¹All amino acids are denoted using 'the one-letter notation for amino acid sequences' from the International Union of Pure and Applied Biochemistry and International 
Union of Biochemistry, 1971: Alanine = A; Arginine = R; Asparagine = N; Aspartic acid = D; Cysteine = C; Glutamic acid = E; Glutamine = Q; Glutamine or Glutamic acid = Z; 
Glycine = G; Histidine = H; Isoleucine = I; Leucine = L; Lysine = K; Methionine = M; Phenylalanine = F; Proline = P; Serine = S; Threonine = T; Tryptophan = W; Tyrosine = Y; 
Valine = V. 

²2 = ACE-inhibitor; 4 = stimulating (glucose uptake-, vasoactive substance release-); 6 = antioxidative; 8 = inhibitor (dipeptidyl peptidase IV inhibitor-, dipeptidyl-aminopeptidase IV 
inhibitor-, dipeptidyl carboxypeptidase-, CaMPDE-, neuropeptide-); 9 = hypotensive. 
doi:10.1371/journal.pone.0098922.t006 

The number of bioactive peptides predicted to be released after 
gastric and small intestinal digestion combined was considerably 
higher than after gastric digestion alone, but was still much 
lower than the number of bioactive peptide amino 
acid sequences identified within the intact protein (3.3% of the 
total number of the identified bioactive peptides were predicted to 
be released across protein sources). It was predicted that after 
combined gastric and small intestinal digestion, many endogenous 
proteins were an equal source of bioactive peptides compared to 
the selected dietary proteins, with a mean A_D across all of the 
endogenous proteins of 23 compared to 22 for the dietary proteins. 
Moreover, at least two of the endogenous proteins had a greater A_D 
value in comparison with β-casein, a known rich source of 
bioactive peptides. 

Not all gut endogenous proteins are secreted ubiquitously 
throughout the gastrointestinal tract [25]. For example, while 
serum albumin is known to be secreted into both the stomach and 
the small intestine [39,40], trypsin is only secreted into the 
duodenum and therefore is only subject to digestion in the small 
intestine. For proteins that are secreted in the small intestine, 
digestion in the gastrointestinal tract was predicted based on an in 
silico model for small intestinal digestion alone with the two major 
intestinal enzymes trypsin and chymotrypsin. The number of 
bioactive peptides predicted to be present after small intestinal 
digestion alone was much lower than that predicted 
after both gastric and intestinal digestion. For example, the total 
number of bioactive peptides predicted to be released after small 
intestinal digestion of serum albumin (14 bioactive peptides per 
protein molecule) was much lower than that predicted for gastric 
and intestinal digestion (22 bioactive peptides per protein 
molecule). Despite this, the results of the present study 
suggest that gut endogenous proteins secreted into the small 
intestine may also be significant sources of bioactive peptides. 
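
The logic of a small intestinal in silico digestion followed by a database match can be illustrated with a short Python sketch. This is a minimal sketch only and is not the model used in the present study: the cleavage rules are simplified textbook specificities (trypsin cleaving C-terminal to K or R, chymotrypsin C-terminal to F, Y or W, neither before P), the protein sequence is a hypothetical toy sequence, and the "database" is a toy set of four peptides taken from Table 6. The function names are hypothetical.

```python
import re

# Simplified textbook cleavage rules (an assumption, not the study's rule set):
#   trypsin:      cuts C-terminal to K or R, unless the next residue is P
#   chymotrypsin: cuts C-terminal to F, Y or W, unless the next residue is P
CLEAVAGE_PATTERNS = {
    "trypsin": re.compile(r"(?<=[KR])(?!P)"),
    "chymotrypsin": re.compile(r"(?<=[FYW])(?!P)"),
}


def digest(sequence, enzymes):
    """Return the fragments left after complete cleavage by the listed enzymes."""
    cut_sites = set()
    for enzyme in enzymes:
        for match in CLEAVAGE_PATTERNS[enzyme].finditer(sequence):
            # A zero-width match at index i means "cut between i-1 and i".
            if 0 < match.start() < len(sequence):
                cut_sites.add(match.start())
    bounds = [0] + sorted(cut_sites) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def released_bioactives(fragments, known_bioactives):
    """Keep fragments whose sequence is identical to a known bioactive peptide."""
    return [f for f in fragments if f in known_bioactives]


# Hypothetical toy sequence and toy database (peptides taken from Table 6).
toy_protein = "GKLFAWNYSR"
toy_database = {"LF", "AW", "NY", "VPK"}

fragments = digest(toy_protein, ["trypsin", "chymotrypsin"])
print(fragments)                                     # ['GK', 'LF', 'AW', 'NY', 'SR']
print(released_bioactives(fragments, toy_database))  # ['LF', 'AW', 'NY']
```

Only fragments that match a database entry exactly are counted, which is why a bioactive sequence present within an intact protein is not necessarily predicted to be released.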

After small intestinal digestion alone, the predicted released 
bioactive peptides possessed fewer bioactivities. For example, 
across all of the examined proteins (gut endogenous and dietary) 
the bioactive peptides predicted to be released after gastric and 
intestinal digestion had collectively up to 7 different bioactivities, 
while after small intestinal digestion alone, the predicted bioactive 
peptides collectively had only up to 3 different bioactivities, with 
the exception of serum albumin and mucin-2, which were predicted 
to release bioactive peptides in two additional bioactivity 
categories. Furthermore, for proteins that are secreted in both 
the stomach and small intestine, the same protein was predicted to 
release different bioactive peptide sequences in terms of total 
number and amino acid sequence depending on the site of 
digestion (gastric+small intestinal vs. small intestinal alone; 
Table 6). 

Table 7. Predicted quantity of bioactive peptides (mg/d) released after digestion of either dietary proteins or gut endogenous proteins in the gastrointestinal tract. 

Protein source^2 | Predicted mean quantity of bioactive peptides released^1 (mg/g protein) | Estimated quantity of bioactive peptides released after digestion in the gastrointestinal tract (mg/d) 
Dietary protein 
β-casein, bovine milk | 87 | Dairy: 348 
Gliadin and Glutenin, wheat | 14 | Wheat products: 196 
Glycinin, soya | 69 | Soya products: 207 
Ovalbumin, chicken egg | 54 | Chicken egg products: 324 
Actin and Myosin, chicken meat | 59 | Chicken meat: 767 
Predicted (total) amount of bioactive peptides (mg/d) | | 1842 
Gut endogenous protein^3 
Mucin, Serum albumin, Pepsin A, Gastrin and Lysozyme C | 56 | 
Predicted (total) amount of bioactive peptides (mg/d) | | 2689 

^1 Estimated based on the predicted total number of bioactive peptides released after gastric and small intestinal digestion (from Table 4), and the moles and molar masses of the respective proteins; and considering that the majority of the predicted bioactive peptides are 'dipeptides'. All of the evaluated food proteins are used as a model for the remaining proteins in the respective food product. 

^2 The model diet is based on a recommended diet for a healthy adult weighing 60 kg, supplying 0.66 g/kg body weight protein per day, amounting to a protein intake of 40 g per day, designed to comply with the FAO recommendations [1]; whereby dairy, wheat, soya products, chicken egg products and chicken meat contribute 4, 14, 3, 6 and 13 g of protein respectively. Protein content of food products estimated based on the United States Department of Agriculture (USDA) Nutrient Data Laboratory database [42]. 

^3 Calculated based on Moughan, 2011 [41], using the amount of gut endogenous protein nitrogen secreted into the gastrointestinal tract, but excluding protein nitrogen derived from epithelial and bacterial cells. 
doi:10.1371/journal.pone.0098922.t007 

For the most abundantly predicted bioactivity, ACE-inhibition, 
based on the present in silico digestion model, it would appear that 
on a per molecule basis, gut endogenous proteins may be similar to 
dietary proteins in terms of the potential to release ACE-inhibitory 
peptides in the upper gastrointestinal tract as a result of digestion. 

The majority of the bioactive peptide sequences present in the 
amino acid sequence of the intact gut endogenous protein and 
after in silico digestion were di- or tri-peptides, while for the 
dietary proteins, bioactive peptides of 6 to 9 amino acids in length 
were also observed (Table 6). The 3 known opioid agonist peptides 
in β-casein (5 to 11 amino acids long, data not shown) were also 
longer in chain length than the average bioactive peptide chain 
length observed in the gut endogenous proteins. In terms of the 
amino acid composition of the gut endogenous proteins evaluated, 
it is of note that many of them contain significant amounts of 
glycine or proline or both, and it has been reported that a high 
content of glycine and proline is related to a higher probability of 
finding bioactive peptide fragments [32]. 

This study makes no attempt to investigate the efficacy of 
bioactive peptides but rather provides an in silico prediction of the 
number and types of bioactive peptides that potentially can be 
generated in the gastrointestinal tract during digestion. 

To put the current findings into context, an attempt was made to 
predict the amounts of bioactive peptides that may be released into 
the gastrointestinal tract per day from either dietary protein or gut 
endogenous protein sources (Table 7). For the daily dietary protein 
intake, the food proteins examined in the present study 
were used as ingredients for a theoretical diet. This model diet was 
formulated to contain approximately 40 g of protein, which 
represents the Food and Agriculture Organization of the United 
Nations (FAO) recommended daily protein intake for a healthy 
adult weighing 60 kg [1]. The proportion of each individual 
dietary protein was derived based on a model diet assumed to 
contain 127 g of dairy products, 128 g of wheat-based products, 
25 g of soya products, 44 g of egg (1 medium-sized egg) and 46 g of 
roasted chicken (the vegetables, fruits, fats and sugars in the diet 
were omitted from the present estimations as they contain 
negligible amounts of protein). The amount of endogenous 
protein secreted into the gastrointestinal tract was estimated based 
on the reported amounts of gut endogenous protein nitrogen 
secreted into the gastrointestinal tract, excluding protein 
nitrogen derived from epithelial and bacterial cells [41]. Based 
on the model diet, it is predicted that in a healthy adult, dietary 
proteins may contribute 1842 mg, while the gut endogenous 
proteins (excluding microbial protein and sloughed cells) may yield 
up to 2689 mg of bioactive peptides per day. Given that microbial 
protein and sloughed cells, which make up approximately two 
thirds of the total gut non-dietary protein, were not included in the 
latter prediction, it is likely that the amount of bioactive peptides 
derived from gut endogenous proteins would be much higher. 
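
The arithmetic behind the dietary estimate in Table 7 can be retraced with a short Python sketch. It is an illustration only: the figures are transcribed from Table 7 and its footnotes, the variable names are hypothetical, and the final back-calculation of the endogenous protein quantity is an inference from the two reported endogenous values rather than a figure stated in the text.

```python
# Values transcribed from Table 7 and its footnotes:
# (mg bioactive peptides per g protein, g protein per day in the model diet)
diet = {
    "Dairy (beta-casein)": (87, 4),
    "Wheat products (gliadin and glutenin)": (14, 14),
    "Soya products (glycinin)": (69, 3),
    "Chicken egg products (ovalbumin)": (54, 6),
    "Chicken meat (actin and myosin)": (59, 13),
}

# Recommended intake: 0.66 g protein per kg body weight for a 60 kg adult.
recommended_protein_g = 0.66 * 60  # about 39.6 g/d, rounded to 40 g/d in the text

# Daily yield per food group = (mg per g protein) x (g protein per day).
daily_yield_mg = {food: mg_per_g * grams for food, (mg_per_g, grams) in diet.items()}
total_protein_g = sum(grams for _, grams in diet.values())
total_dietary_mg = sum(daily_yield_mg.values())

for food, mg in daily_yield_mg.items():
    print(f"{food}: {mg} mg/d")
print(f"Total dietary protein: {total_protein_g} g/d")       # 40 g/d
print(f"Total dietary bioactive peptides: {total_dietary_mg} mg/d")  # 1842 mg/d

# Inference only (not stated in the source): the endogenous protein quantity
# implied by the reported 2689 mg/d and 56 mg/g protein.
implied_endogenous_protein_g = 2689 / 56
print(f"Implied endogenous protein: {implied_endogenous_protein_g:.0f} g/d")
```

The per-food products reproduce the 348, 196, 207, 324 and 767 mg/d entries of Table 7, and their sum reproduces the dietary total of 1842 mg/d.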

In conclusion, based on an in silico prediction it would appear 
that gut endogenous proteins may be an important and diverse 
source of bioactive peptides, in comparison with food proteins, 
particularly given that gut endogenous proteins are likely to be 
present in the gastrointestinal tract at a more constant concentration 
and composition than proteins derived from the diet. 
However, further in vitro and in vivo work is needed to corroborate 
the in silico predictions of the present study. 